NSF PAR Search | NSF Public Access Repository


Expanding Russian PropBank: Challenges and Insights for Developing new SRL Resources

Myers, Skatje; Khamov, Roman; Pollins, Adam; Tozier, Rebekah; Babko-Malaya, Olga; Palmer, Martha (May 2024, ELRA and ICCL)
Bonial, Claire; Bonn, Julia; Hwang, Jena D (Ed.)
Semantic role labeling (SRL) resources, such as Proposition Bank (PropBank), provide useful input to downstream applications. In this paper we present some challenges and insights we learned while expanding the previously developed Russian PropBank. This new effort involved annotation and adjudication of all predicates within a subset of the prior work in order to provide a test corpus for future applications. We discuss a number of new issues that arose while developing our PropBank for Russian as well as our solutions. Framing issues include: distinguishing between morphological processes that warrant new frames, differentiating between modal verbs and predicate verbs, and maintaining accurate representations of a given language’s semantics. Annotation issues include disagreements derived from variability in Universal Dependency parses and semantic ambiguity within the text. Finally, we demonstrate how Russian sentence structures reveal inherent limitations to PropBank’s ability to capture semantic data. These discussions should prove useful to anyone developing a PropBank or similar SRL resources for a new language.
Full Text Available
UMR-Writer 2.0: Incorporating a New Keyboard Interface and Workflow into UMR-Writer

Ge, Sijia; Wright-Bettner, Kristin; Myers, Skatje; Xue, Nianwen; Palmer, Martha (July 2023, Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII))

UMR-Writer is a web-based tool for annotating semantic graphs with the Uniform Meaning Representation (UMR) scheme. UMR is a graph-based semantic representation that can be applied cross-linguistically for deep semantic analysis of texts. In this work, we implemented a new keyboard interface in UMR-Writer 2.0, which is a powerful addition to the original mouse interface, supporting faster annotation for more experienced annotators. The new interface also addresses issues with the original mouse interface. Additionally, we demonstrate an efficient workflow for annotation project management in UMR-Writer 2.0, which has been applied to many projects.
Full Text Available
UMR-Writer 2.0: Incorporating a New Keyboard Interface and Workflow into UMR-Writer

https://doi.org/10.18653/v1/2023.law-1.21

Ge, Sijia; Zhao, Jin; Wright-Bettner, Kristin; Myers, Skatje; Xue, Nianwen; Palmer, Martha (July 2023, The 17th Linguistic Annotation Workshop (LAW-XVII))

Full Text Available
Building a Broad Infrastructure for Uniform Meaning Representations

Bonn, Julia; Buchholz, Matthew J; Chun, Jayeol; Cowell, Andrew; Croft, William; Denk, Lukas; Ge, Sijia; Hajič, Jan; Lai, Kenneth; Martin, James H; et al (May 2024, ELRA and ICCL)
Calzolari, Nicoletta; Kan, Min-Yen; Hoste, Veronique; Lenci, Alessandro; Sakti, Sakriani; Xue, Nianwen (Ed.)
This paper reports the first release of the UMR (Uniform Meaning Representation) data set. UMR is a graph-based meaning representation formalism consisting of a sentence-level graph and a document-level graph. The sentence-level graph represents predicate-argument structures, named entities, word senses, aspectuality of events, as well as person and number information for entities. The document-level graph represents coreferential, temporal, and modal relations that go beyond sentence boundaries. UMR is designed to capture the commonalities and variations across languages and this is done through the use of a common set of abstract concepts, relations, and attributes as well as concrete concepts derived from words from individual languages. This UMR release includes annotations for six languages (Arapaho, Chinese, English, Kukama, Navajo, Sanapana) that vary greatly in terms of their linguistic properties and resource availability. We also describe on-going efforts to enlarge this data set and extend it to other genres and modalities. We also briefly describe the available infrastructure (UMR annotation guidelines and tools) that others can use to create similar data sets.
Full Text Available
PropBank Comes of Age—Larger, Smarter, and more Diverse

https://doi.org/10.18653/v1/2022.starsem-1.24

Pradhan, Sameer; Bonn, Julia; Myers, Skatje; Conger, Kathryn; O’Gorman, Tim; Gung, James; Wright-Bettner, Kristin; Palmer, Martha (July 2022, Proceedings of the 11th Joint Conference on Lexical and Computational Semantics)
Vivi Nastase; Ellie Pavlick; Mohammad Taher Pilehvar; Jose Camacho-Collados; Alessandro Raganato (Ed.)
This paper describes the evolution of the PropBank approach to semantic role labeling over the last two decades. During this time the PropBank frame files have been expanded to include non-verbal predicates such as adjectives, prepositions and multi-word expressions. The number of domains, genres and languages that have been PropBanked has also expanded greatly, creating an opportunity for much more challenging and robust testing of the generalization capabilities of PropBank semantic role labeling systems. We also describe the substantial effort that has gone into ensuring the consistency and reliability of the various annotated datasets and resources, to better support the training and evaluation of such systems.
Full Text Available
Mapping AMR to UMR: Resources for Adapting Existing Corpora for Cross-Lingual Compatibility

Bonn, Julia; Myers, Skatje; Van Gysel, Jens E.; Denk, Lukas; Vigus, Meagan; Zhao, Jin; Cowell, Andrew; Croft, William; Hajic, Jan; Martin, James H; et al (March 2023, The 21st International Workshop on Treebanks and Linguistic Theories (TLT, GURT/SyntaxFest 2023))

Full Text Available
Mapping AMR to UMR: Resources for Adapting Existing Corpora for Cross-Lingual Compatibility

Bonn, Julia; Myers, Skatje; Van Gysel, Jens E.; Denk, Lukas; Vigus, Meagan; Zhao, Jin; Cowell, Andrew; Croft, William; Hajic, Jan; Martin, James H.; et al (March 2023, Proceedings of the 21st International Workshop on Treebanks and Linguistic Theories (TLT, GURT/SyntaxFest 2023))

This paper presents detailed mappings between the structures used in Abstract Meaning Representation (AMR) and those used in Uniform Meaning Representation (UMR). These structures include general semantic roles, rolesets, and concepts that are largely shared between AMR and UMR, but with crucial differences. While UMR annotation of new low-resource languages is ongoing, AMR-annotated corpora already exist for many languages, and these AMR corpora are ripe for conversion to UMR format. Rather than focusing on semantic coverage that is new to UMR (which will likely need to be dealt with manually), this paper serves as a resource (with illustrated mappings) for users looking to understand the fine-grained adjustments that have been made to the representation techniques for semantic categories present in both AMR and UMR.
Full Text Available
Fine-grained Information Extraction from Biomedical Literature based on Knowledge-enriched Abstract Meaning Representation

https://doi.org/10.18653/v1/2021.acl-long.489

Zhang, Zixuan; Parulian, Nikolaus Nova; Ji, Heng; Elsayed, Ahmed S.; Myers, Skatje; Palmer, Martha (August 2021, Proc. The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021))

Full Text Available
Russian PropBank

Moeller, Sarah; Wagner, Irina; Palmer, Martha; Conger, Kathryn; Myers, Skatje (May 2020, Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020))

This paper presents a proposition bank for Russian (RuPB), a resource for semantic role labeling (SRL). The motivating goal for this resource is to automatically project semantic role labels from English to Russian. This paper describes frame creation strategies, coverage, and the process of sense disambiguation. It discusses language-specific issues that complicated the process of building the PropBank and how these challenges were exploited as language-internal guidance for consistency and coherence.
Full Text Available
